
    Deep Learning Applications for Biomedical Data and Natural Language Processing

    The human brain can be seen as an ensemble of interconnected neurons, more or less specialized to solve different cognitive and motor tasks. In computer science, the term deep learning is often applied to sets of interconnected nodes, where deep means that they have several computational layers. The development of deep learning is essentially a quest to mimic, at least partially, how the human brain operates. In this thesis, I use machine learning techniques to tackle two different domains of problems. The first is a problem in natural language processing: we improved the classification of relations within images using text associated with the pictures. The second domain concerns heart transplantation: we created models for pre- and post-transplant survival and simulated a whole transplantation queue to be able to assess the impact of different allocation policies. We used deep learning models to solve these problems. As an introduction to these problems, I present the basic concepts of machine learning: how to represent data, how to evaluate prediction results, and how to create different models to predict values from data. Following that, I also introduce the field of heart transplantation and some background on simulation.

    Using Syntactic Dependencies to Solve Coreferences

    This paper describes the structure of the LTH coreference solver used in the closed track of the CoNLL 2012 shared task (Pradhan et al., 2012). The solver core is a mention classifier that uses Soon et al.'s (2001) algorithm and features extracted from the dependency graphs of the sentences. This system builds on Björkelund and Nugues's (2011) solver, which we extended so that it can be applied to the three languages of the task: English, Chinese, and Arabic. We designed a new mention detection module that removes pleonastic pronouns, prunes constituents, and recovers mentions when they do not exactly match a noun phrase. We carefully redesigned the features so that they reflect more complex linguistic phenomena as well as discourse properties. Finally, we introduced a minimal cluster model grounded in the first mention of an entity. We optimized the feature sets for the three languages: we carried out an extensive evaluation of pairs of features, and we complemented the single features with associations that improved the CoNLL score. We obtained the respective scores of 59.57, 56.62, and 48.25 on English, Chinese, and Arabic on the development set, 59.36, 56.85, and 49.43 on the test set, and the combined official score of 55.21.
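
    As a concrete illustration of the instance generation at the heart of Soon et al.'s (2001) algorithm, the sketch below builds one positive pair per anaphoric mention and negative pairs for the mentions in between. The Mention class and its fields are hypothetical simplifications; the LTH solver's actual feature extraction over dependency graphs is far richer.

        from dataclasses import dataclass

        @dataclass
        class Mention:
            index: int       # position of the mention in the document
            entity_id: int   # gold cluster id, -1 for non-anaphoric mentions

        def soon_training_pairs(mentions):
            """For each anaphoric mention, emit a positive pair with its
            closest coreferent antecedent and a negative pair with every
            mention in between (Soon et al., 2001)."""
            pairs = []
            for j, anaphor in enumerate(mentions):
                antecedent = None
                for i in range(j - 1, -1, -1):
                    if mentions[i].entity_id == anaphor.entity_id != -1:
                        antecedent = i
                        break
                if antecedent is None:
                    continue
                pairs.append((mentions[antecedent], anaphor, 1))  # positive
                for i in range(antecedent + 1, j):
                    pairs.append((mentions[i], anaphor, 0))       # negative
            return pairs

        doc = [Mention(0, 1), Mention(1, -1), Mention(2, 1)]
        print(soon_training_pairs(doc))  # one positive (m0, m2), one negative (m1, m2)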

    Challenges in teaching international students: group separation, language barriers and cultural differences

    Every year, more than 2,500 new international students come to Lund University to begin their higher education. A higher number of foreign students increases the international competitiveness of Lund University, but at the same time international students face many problems, for instance culture shock and language barriers. In this report, we focus on issues in teaching international students, specifically group separation, language barriers, and cultural differences. The problem of group separation lies in the isolation of international students from local students, which makes it harder for them to communicate with each other. The result can be language and cultural boundaries between local and international students. Based on our discussion in the report, we suggest a few possible techniques to address these problems.

    Combining Text Semantics and Image Geometry to Improve Scene Interpretation

    In this paper, we describe a novel system that identifies relations between the objects extracted from an image. We started from the idea that, in addition to the geometric and visual properties of the image objects, we could exploit lexical and semantic information from the text accompanying the image. As an experimental setup, we gathered a corpus of images from Wikipedia as well as their associated articles. We extracted two types of objects, human beings and horses, and we considered three relations that could hold between them: Ride, Lead, or None. We used geometric features as a baseline to identify the relations between the entities, and we describe the improvements brought by the addition of bag-of-words features and predicate–argument structures we derived from the text. The best semantic model resulted in a relative error reduction of more than 18% over the baseline.
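
    The feature combination described above can be sketched as follows: concatenate a geometric feature vector per object pair with a bag-of-words vector built from the accompanying article, and train a classifier on the result. The classifier choice, feature values, and toy corpus below are illustrative assumptions, not the paper's actual setup.

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.linear_model import LogisticRegression

        # Toy stand-ins: two geometric features per human-horse pair
        # (e.g. bounding-box overlap and relative vertical offset).
        geometric = np.array([[0.8, 0.1], [0.3, 0.6], [0.0, 0.9]])
        articles = ["the jockey rides the horse at the derby",
                    "a groom leading the horse by its halter",
                    "a horse grazing alone in a field"]
        labels = ["Ride", "Lead", "None"]

        # Bag-of-words vectors from the accompanying articles.
        bow = CountVectorizer().fit_transform(articles).toarray()

        # Concatenate geometric and textual features into one vector.
        features = np.hstack([geometric, bow])
        clf = LogisticRegression(max_iter=1000).fit(features, labels)
        print(clf.predict(features))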

    Combining Text Semantics and Image Geometry to Identify Relations

    Automatically identifying actions and relations between objects in images can be useful for many applications, for example image and video labeling. In this thesis, the goal is to extract a predefined set of objects from images and to identify the relations linking these objects. Wikipedia is the source of the images and texts that are analyzed. I created a program that takes both geometric information from the images and semantic information from the texts, and outputs a classification of the relations for each object in the images. The baseline of using only geometric features is improved on by using bag-of-words features based on the articles, and further enhanced by utilizing predicate information. Acknowledgments: I would like to thank the following people for their involvement in creating this thesis. Pierre Nugues: thanks for being my supervisor and giving guidance and feedback during the thesis. Peter Exner: thanks for guidance and help with the semantic parser.
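
    The predicate information mentioned above could, for instance, be turned into classifier features along these lines; the frame format (predicate, arg0, arg1) is a hypothetical simplification of the semantic parser's actual output.

        def predicate_features(frames, subj="person", obj="horse"):
            """Map predicate-argument frames to binary classifier features.
            `frames` is assumed to be a list of (predicate_lemma, arg0_head,
            arg1_head) tuples; real parser output is more detailed."""
            feats = set()
            for pred, arg0, arg1 in frames:
                if arg0 == subj and arg1 == obj:
                    feats.add("pred=" + pred)
            return feats

        # "The jockey rides the horse" -> ("ride", "person", "horse")
        print(predicate_features([("ride", "person", "horse"),
                                  ("graze", "horse", None)]))
        # {'pred=ride'}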

    Applications of Machine Learning on Natural Language Processing and Biomedical Data

    Machine learning is ubiquitous in today's society, with promising applications in the field of natural language processing (NLP), so that computers can handle human language better, and within the medical community, with the promise of better treatments. Machine learning can be seen as a subfield of artificial intelligence (AI), where AI describes a machine that mimics cognitive functions that humans associate with other human minds, such as learning or problem solving. In this thesis, we explore how machine learning can be used to improve the classification of pictures by using associated text. We then shift our focus to biomedical data, specifically heart transplantation patients. We show how the data can be represented as a graph database using the Resource Description Framework (RDF). After that, we use the data with logistic regression and the Spark framework to perform a feature search to predict the survival probability of the patients. In the last two articles, we use artificial neural networks (ANNs), first to predict patient survival, comparing them with a logistic regression approach, and then to predict the outcome of patients awaiting heart transplantation. We plan to simulate different allocation policies for donor hearts using these kinds of ANNs, to be able to assess their impact on predicted earned survival time.
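
    A minimal sketch of the logistic regression step with the Spark framework might look as follows; the input file, column names, and features are invented for illustration and do not reflect the actual registry extraction.

        from pyspark.sql import SparkSession
        from pyspark.ml.feature import VectorAssembler
        from pyspark.ml.classification import LogisticRegression

        spark = SparkSession.builder.appName("survival").getOrCreate()

        # Hypothetical input: one row per patient, with a binary label
        # `survived_1y` indicating survival one year after transplant.
        df = spark.read.parquet("patients.parquet")

        assembler = VectorAssembler(
            inputCols=["recipient_age", "donor_age", "ischemia_time"],
            outputCol="features")

        model = LogisticRegression(labelCol="survived_1y").fit(assembler.transform(df))

        # Predicted survival probabilities for the same cohort.
        model.transform(assembler.transform(df)).select("probability").show(5)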

    Predicting the Outcome for Patients in a Heart Transplantation Queue using Deep Learning

    Heart transplantation has made it possible to extend the median survival time to 12 years for patients with end-stage heart diseases. This operation is unfortunately limited by the availability of donor organs, and patients have to wait on average about 200 days on a waiting list before being operated on. This waiting time varies considerably across patients. In this paper, we studied the outcome for patients entering a transplantation waiting list using deep learning techniques. We implemented a model in the form of two-layer neural networks, and we predicted the outcome as still waiting, transplanted, or dead on the waiting list at three different time points: 180 days, 365 days, and 730 days. As our data source, we used the United Network for Organ Sharing (UNOS) registry, from which we extracted adult patients (>17 years) from January 2000 to December 2011. We trained our model using the Keras framework, and we report F1 macro scores of 0.674, 0.680, and 0.680, respectively, compared to a baseline of 0.271. We also applied a backward elimination procedure, using our neural network, to extract the 10 most significant parameters predicting the patient status for the three different time points.
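
    A two-layer network of the kind described can be sketched in Keras as below; the layer sizes, optimizer, and the stand-in random data are assumptions, since the abstract does not specify them.

        import numpy as np
        from tensorflow import keras

        NUM_FEATURES = 50  # placeholder; the UNOS extraction has many more
        NUM_CLASSES = 3    # still waiting, transplanted, dead on the list

        model = keras.Sequential([
            keras.layers.Dense(64, activation="relu", input_shape=(NUM_FEATURES,)),
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(NUM_CLASSES, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])

        # Random stand-ins for patient features at listing time (X)
        # and the status at, say, 180 days (y).
        X = np.random.rand(1000, NUM_FEATURES)
        y = np.random.randint(0, NUM_CLASSES, size=1000)
        model.fit(X, y, epochs=10, validation_split=0.1)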

    Simulating the Outcome of Heart Allocation Policies Using Deep Neural Networks

    We created a system to simulate the heart allocation process in a transplant queue, using a discrete event model and a neural network algorithm, which we named the Lund Deep Learning Transplant Algorithm (LuDeLTA). LuDeLTA is used to predict the survival of the patients both in the queue and after transplant. We tried four different allocation policies: wait time, clinical rules, and allocating the patients using either LuDeLTA or the International Heart Transplant Survival Algorithm (IHTSA) model. Both IHTSA and LuDeLTA were used to evaluate the results. The predicted mean survival was about 4,300 days when allocating according to wait time, 4,300 days with clinical rules, and 4,700 days when using neural networks.
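
    The discrete event model can be illustrated with a minimal queue simulation; every rate, the survival draw, and the wait-time policy below are invented stand-ins, with a comment marking where a survival model such as LuDeLTA would be queried.

        import heapq
        import random

        def simulate(policy, horizon_days=3650, seed=0):
            """Minimal discrete-event sketch of a transplant queue. `policy`
            picks a recipient from the waiting list whenever a donor heart
            arrives."""
            rng = random.Random(seed)
            events = [(0.0, "patient"), (1.0, "heart")]
            heapq.heapify(events)
            waiting, survival = [], []
            while events:
                t, kind = heapq.heappop(events)
                if t > horizon_days:
                    break
                if kind == "patient":
                    waiting.append(t)  # store the listing time
                    heapq.heappush(events, (t + rng.expovariate(1 / 2.0), "patient"))
                else:  # a donor heart becomes available
                    if waiting:
                        recipient = policy(waiting, t)
                        waiting.remove(recipient)
                        # a survival model such as LuDeLTA would be queried here
                        survival.append(rng.expovariate(1 / 4300.0))
                    heapq.heappush(events, (t + rng.expovariate(1 / 3.0), "heart"))
            return sum(survival) / max(len(survival), 1)

        # Wait-time policy: the longest-waiting patient gets the heart.
        print(simulate(lambda waiting, now: min(waiting)))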

    Selection of an Optimal Feature Set to Predict Heart Transplantation Outcomes

    Heart transplantation (HT) is a life-saving procedure, but a limited donor supply forces surgeons to prioritize the recipients. An understanding of the factors that predict mortality could help doctors with this task. The objective of this study is to find locally optimal feature sets to predict the survival of HT patients over different time periods. To this end, we applied logistic regression together with a greedy forward and backward search. As our data source, we used the United Network for Organ Sharing (UNOS) registry, from which we extracted adult patients (>17 years) from January 1997 to December 2008. As methods to predict survival, we used the Index for Mortality Prediction After Cardiac Transplantation (IMPACT) and the International Heart Transplant Survival Algorithm (IHTSA). We used the LIBLINEAR library together with the Apache Spark cluster computing framework to carry out the computation, and we found feature sets for 1-, 5-, and 10-year survival for which we obtained areas under the ROC curve (AUROC) of 68%, 68%, and 76%, respectively.
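
    The greedy forward search can be sketched as below. The paper used LIBLINEAR on Spark; this illustration substitutes scikit-learn's logistic regression and cross-validated AUROC, and the synthetic data is only there to make the sketch runnable.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import cross_val_predict

        def greedy_forward(X, y, names):
            """Repeatedly add the feature that most improves cross-validated
            AUROC; stop when no remaining feature helps."""
            selected, best_auc = [], 0.0
            while True:
                best_feat = None
                for f in names:
                    if f in selected:
                        continue
                    cols = [names.index(g) for g in selected + [f]]
                    probs = cross_val_predict(
                        LogisticRegression(max_iter=1000), X[:, cols], y,
                        cv=5, method="predict_proba")[:, 1]
                    auc = roc_auc_score(y, probs)
                    if auc > best_auc:
                        best_auc, best_feat = auc, f
                if best_feat is None:
                    return selected, best_auc
                selected.append(best_feat)

        X, y = make_classification(n_samples=300, n_features=8, random_state=0)
        feats, auc = greedy_forward(X, y, ["f%d" % i for i in range(8)])
        print(feats, round(auc, 3))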

    Streamlining a Transplantation Survival Prediction Program with an RDF Triplestore

    In this paper, we describe the conversion of a heart transplantation dataset into the RDF format and its application in a survival prediction program. The International Society for Heart & Lung Transplantation (ISHLT) maintains a registry of heart transplantations that it gathers from operations performed worldwide. The US United Network for Organ Sharing (UNOS) and the Scandinavian Scandiatransplant are contributors to this registry, although they use different data models. We designed a unified graph representation covering the three data models, and we converted the ISHLT tables into RDF triples. We used the resulting triplestore as input to a survival prediction program for transplanted patients [3, 2]. Recipient and donor properties are essential to predict the mortality or survival of the patients. In contrast with the manual techniques we used to extract data from the tabulated files, the RDF triplestore enables us to experiment quickly and automatically with the combinations of features that most influence survival, which we will demonstrate at the conference.
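
    A toy version of the triple conversion, using the rdflib library, might look as follows; the vocabulary and property names are hypothetical, as the unified ISHLT/UNOS/Scandiatransplant schema is not spelled out in the abstract.

        from rdflib import Graph, Literal, Namespace, RDF

        # Hypothetical vocabulary; the unified graph representation in the
        # paper covers the full ISHLT/UNOS/Scandiatransplant data models.
        TX = Namespace("http://example.org/transplant#")

        g = Graph()
        patient, donor = TX["patient/12345"], TX["donor/67890"]
        g.add((patient, RDF.type, TX.Recipient))
        g.add((patient, TX.age, Literal(54)))
        g.add((patient, TX.receivedHeartFrom, donor))
        g.add((donor, TX.age, Literal(31)))

        # SPARQL makes it easy to pull out new feature combinations.
        query = """
            PREFIX tx: <http://example.org/transplant#>
            SELECT ?rage ?dage WHERE {
                ?p a tx:Recipient ; tx:age ?rage ; tx:receivedHeartFrom ?d .
                ?d tx:age ?dage .
            }"""
        for row in g.query(query):
            print(row.rage, row.dage)  # 54 31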